Top Comment or Flop Comment? Predicting and Explaining User Engagement in Online News Discussions
Comment sections below online news articles enjoy growing popularity among
readers. However, the overwhelming number of comments makes it infeasible for
the average news consumer to read all of them and hinders engaging discussions.
Most platforms display comments in chronological order, which neglects that
some of them are more relevant to users and are better conversation starters.
In this paper, we systematically analyze user engagement in the form of the
upvotes and replies that a comment receives. Based on comment texts, we train a
model to distinguish comments that have either a high or low chance of
receiving many upvotes and replies. Our evaluation on user comments from
TheGuardian.com compares recurrent and convolutional neural network models, and
a traditional feature-based classifier. Further, we investigate what makes some
comments more engaging than others. To this end, we identify engagement
triggers and arrange them in a taxonomy. Explanation methods for neural
networks reveal which input words have the strongest influence on our model's
predictions. In addition, we evaluate on a dataset of product reviews, which
exhibit properties similar to user comments, such as upvotes for helpfulness.
Comment: Accepted at the International Conference on Web and Social Media
(ICWSM 2020); 11 pages; code and data are available at
https://hpi.de/naumann/projects/repeatability/text-mining.htm
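The feature-based side of such a high-vs-low engagement classifier can be sketched minimally as a Laplace-smoothed word log-odds score. The data, tokenization, and scoring scheme below are illustrative assumptions, not the paper's actual model:

```python
from collections import Counter
import math

def train_log_odds(high, low, alpha=1.0):
    """Estimate per-word log-odds of a comment being highly engaging.

    high/low: lists of tokenized comments labelled by engagement.
    Returns a dict mapping word -> log P(w|high)/P(w|low), Laplace-smoothed.
    """
    ch = Counter(w for c in high for w in c)
    cl = Counter(w for c in low for w in c)
    vocab = set(ch) | set(cl)
    nh, nl = sum(ch.values()), sum(cl.values())
    return {w: math.log((ch[w] + alpha) / (nh + alpha * len(vocab)))
             - math.log((cl[w] + alpha) / (nl + alpha * len(vocab)))
            for w in vocab}

def predict(comment, log_odds):
    """Label a tokenized comment 'high' if its summed word log-odds are positive."""
    score = sum(log_odds.get(w, 0.0) for w in comment)
    return "high" if score > 0 else "low"

# Toy labelled comments (invented for illustration).
high = [["great", "point", "well", "argued"], ["interesting", "question"]]
low = [["first"], ["spam", "spam", "link"]]
lo = train_log_odds(high, low)
print(predict(["great", "question"], lo))  # both words occur only in engaging comments
```

A real baseline of this kind would train on upvote/reply counts thresholded into the two classes, as the abstract describes.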
ssHMM: extracting intuitive sequence-structure motifs from high-throughput RNA-binding protein data
RNA-binding proteins (RBPs) play an important role in RNA post-transcriptional
regulation and recognize target RNAs via sequence-structure motifs. The extent
to which RNA structure influences protein binding in the presence or absence
of a sequence motif is still poorly understood. Existing RNA motif finders
either take the structure of the RNA only partially into account, or employ
models which are not directly interpretable as sequence-structure motifs. We
developed ssHMM, an RNA motif finder based on a hidden Markov model (HMM) and
Gibbs sampling which fully captures the relationship between RNA sequence and
secondary structure preference of a given RBP. Compared to previous methods
which output separate logos for sequence and structure, it directly produces a
combined sequence-structure motif when trained on a large set of sequences.
ssHMM’s model is visualized intuitively as a graph and facilitates biological
interpretation. ssHMM can be used to find novel bona fide sequence-structure
motifs of uncharacterized RBPs, such as the one presented here for the YY1
protein. ssHMM reaches a high motif recovery rate on synthetic data, recovers
known RBP motifs from CLIP-Seq data, and scales linearly with input size; it is
considerably faster than MEMERIS and RNAcontext on large datasets while on par
with GraphProt. It is freely available on GitHub and as a Docker image.
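The notion of a combined sequence-structure motif can be illustrated with a toy profile over joint (nucleotide, structure) symbols; this sketch deliberately omits ssHMM's actual HMM and Gibbs-sampling machinery, and the binding sites and structure coding are invented:

```python
from collections import Counter

# Aligned binding-site windows: each position carries a (base, structure) pair,
# structure coded here as 'P' (paired stem) or 'U' (unpaired loop).
sites = [
    [("U", "U"), ("G", "U"), ("C", "P")],
    [("U", "U"), ("G", "U"), ("G", "P")],
    [("U", "U"), ("A", "U"), ("C", "P")],
]

def position_profiles(sites):
    """Per-position frequency of combined (base, structure) symbols --
    the kind of joint emission table a sequence-structure motif state holds."""
    length = len(sites[0])
    profiles = []
    for i in range(length):
        counts = Counter(site[i] for site in sites)
        total = sum(counts.values())
        profiles.append({sym: n / total for sym, n in counts.items()})
    return profiles

profiles = position_profiles(sites)
print(profiles[0])  # position 1 is always a U in an unpaired context
```

The point of the joint alphabet is that sequence and structure preference are captured in one table per position, rather than in two separate logos.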
Do We Need Another Explainable AI Method? Toward Unifying Post-hoc XAI Evaluation Methods into an Interactive and Multi-dimensional Benchmark
In recent years, Explainable AI (xAI) attracted a lot of attention as various
countries turned explanations into a legal right. xAI allows for improving
models beyond the accuracy metric by, e.g., debugging the learned pattern and
demystifying the AI's behavior. The widespread use of xAI has brought new
challenges. On the one hand, the number of published xAI algorithms has boomed,
making it difficult for practitioners to select the right tool. On the other
hand, experiments have highlighted how easily data scientists can misuse xAI
algorithms and misinterpret their results. To tackle the issue of
comparing and correctly using feature importance xAI algorithms, we propose
Compare-xAI, a benchmark that unifies all exclusive functional testing methods
applied to xAI algorithms. We propose a selection protocol to shortlist
non-redundant functional tests from the literature, i.e., each targeting a
specific end-user requirement in explaining a model. The benchmark encapsulates
the complexity of evaluating xAI methods into a hierarchical scoring of three
levels, namely, targeting three end-user groups: researchers, practitioners,
and laymen in xAI. The most detailed level provides one score per test. The
second level regroups tests into five categories (fidelity, fragility,
stability, simplicity, and stress tests). The last level is the aggregated
comprehensibility score, which encapsulates the ease of correctly interpreting
the algorithm's output in one easy-to-compare value. Compare-xAI's interactive
user interface helps mitigate errors in interpreting xAI results by quickly
listing the recommended xAI solutions for each ML task and their current
limitations. The benchmark is made available at
https://karim-53.github.io/cxai
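The three-level hierarchy could, in spirit, be aggregated as in the following sketch; the test names, categories' contents, and scores are invented for illustration and are not Compare-xAI's actual tests:

```python
from statistics import mean

# Level 1: one score per functional test, grouped by category
# (hypothetical tests and values).
test_scores = {
    "fidelity":   {"dummy_feature_axiom": 1.0, "interaction_attribution": 0.5},
    "fragility":  {"adversarial_perturbation": 0.8},
    "stability":  {"seed_variation": 0.9},
    "simplicity": {"sparse_explanation": 0.7},
    "stress":     {"high_dim_input": 0.6},
}

# Level 2: one score per category (the five categories named in the abstract).
category_scores = {cat: mean(s.values()) for cat, s in test_scores.items()}

# Level 3: a single aggregated comprehensibility score for laymen to compare.
comprehensibility = mean(category_scores.values())
print(round(comprehensibility, 2))
```

Each audience reads the level matching its needs: researchers the per-test scores, practitioners the category scores, laymen the single value.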
Validation of Tagging Suggestion Models for a Hotel Ticketing Corpus
This paper investigates methods for the prediction of tags on a textual corpus that describes hotel staff inputs in a ticketing system. The aim is to improve the tagging process and find the most suitable method for suggesting tags for a new text entry. The paper consists of two parts: (i) exploration of existing sample data, which includes statistical analysis and visualisation of the data to provide an overview, and (ii) evaluation of tag prediction approaches. We have included approaches from different research fields in order to cover a broad spectrum of possible solutions. As a result, we have tested a machine learning model for multi-label classification (using gradient boosting), a statistical approach (using frequency heuristics), and two simple similarity-based classification approaches (Nearest Centroid and k-Nearest Neighbours). The experiment comparing the approaches uses recall to measure the quality of results. Finally, we provide a recommendation of the modelling approach that produces the best accuracy in terms of tag prediction on the sample data.
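A minimal Nearest Centroid tag suggester along the lines described above might look like this; the ticket texts and tags are invented for illustration:

```python
from collections import Counter, defaultdict
import math

def cosine(a, b):
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[w] * b.get(w, 0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def nearest_centroid_tags(train, text, top_n=1):
    """Suggest the tags whose centroid vector is closest to the new text.

    train: list of (tokenized ticket text, set of tags).
    """
    per_tag = defaultdict(Counter)
    for tokens, tags in train:
        for tag in tags:
            per_tag[tag].update(tokens)  # centroid = summed word counts per tag
    query = Counter(text)
    ranked = sorted(per_tag, key=lambda t: cosine(query, per_tag[t]), reverse=True)
    return ranked[:top_n]

# Hypothetical hotel tickets with tags.
train = [
    (["broken", "shower", "no", "hot", "water"], {"maintenance"}),
    (["tv", "remote", "not", "working"], {"maintenance"}),
    (["extra", "towels", "please"], {"housekeeping"}),
    (["room", "not", "cleaned"], {"housekeeping"}),
]
print(nearest_centroid_tags(train, ["shower", "water", "cold"]))  # ['maintenance']
```

k-Nearest Neighbours differs only in ranking individual training tickets rather than per-tag centroids and voting over their tags.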
Diversifying Product Review Rankings: Getting the Full Picture
E-commerce Web sites owe much of their popularity to the consumer reviews provided together with product descriptions. Online customers spend hours going through heaps of textual reviews to build confidence in products they are planning to buy. At the same time, popular products have thousands of user-generated reviews. Current approaches to presenting them to the user, or recommending an individual review for a product, are based on the helpfulness or usefulness of each review. In this paper we look at the top-k reviews in a ranking to give the user a good summary, with each review complementing the others. To this end we use Latent Dirichlet Allocation to detect latent topics within reviews, and use the star rating assigned to the product as an indicator of the polarity expressed towards the product and the latent topics within the review. We present a framework to cover different ranking strategies based on the user's need: summarizing all reviews; focusing on a particular latent topic; or focusing on positive, negative, or neutral aspects. We evaluated the system using manually annotated review data from a commercial review Web site.
Winner of the best paper award at the 2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology.
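The diversified top-k idea can be sketched as a greedy selection that rewards covering not-yet-seen latent topics, breaking ties by helpfulness. Topic sets, helpfulness scores, and topic names are invented; the paper's actual framework works with LDA topic distributions and star-rating polarity:

```python
def diversified_top_k(reviews, k):
    """Greedy top-k: repeatedly pick the review covering the most
    not-yet-covered topics, breaking ties by helpfulness.

    reviews: list of (id, helpfulness, set of latent topic ids).
    """
    covered, chosen = set(), []
    pool = list(reviews)
    for _ in range(min(k, len(pool))):
        best = max(pool, key=lambda r: (len(r[2] - covered), r[1]))
        chosen.append(best[0])
        covered |= best[2]
        pool.remove(best)
    return chosen

reviews = [
    ("r1", 0.9, {"battery", "screen"}),
    ("r2", 0.8, {"battery"}),
    ("r3", 0.7, {"price", "shipping"}),
    ("r4", 0.6, {"screen", "price"}),
]
print(diversified_top_k(reviews, 2))  # ['r1', 'r3']
```

A pure helpfulness ranking would pick r1 and r2 and cover only two topics; the greedy pass picks r3 second because it adds two new topics.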
Explainable AI under contract and tort law
This paper shows that the law, in subtle ways, may set hitherto unrecognized incentives for the adoption of explainable machine learning applications. In doing so, we make two novel contributions. First, on the legal side, we show that to avoid liability, professional actors, such as doctors and managers, may soon be legally compelled to use explainable ML models. We argue that the importance of explainability reaches far beyond data protection law, and crucially influences questions of contractual and tort liability for the use of ML models. To this effect, we conduct two legal case studies, in medical and corporate merger applications of ML. As a second contribution, we discuss the (legally required) trade-off between accuracy and explainability and demonstrate the effect in a technical case study in the context of spam classification.
Peer Reviewed
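The accuracy-explainability trade-off behind the spam case study can be caricatured in a few lines: a fully transparent single rule versus a slightly richer, less transparent word-score model. The words, weights, and mail are invented and are not the paper's actual case-study setup:

```python
# A hypothetical weighted vocabulary; negative weights mark ham-like words.
SPAM_WORDS = {"winner": 2.0, "free": 1.5, "prize": 2.0, "meeting": -2.0}

def rule_based(text):
    """Fully explainable: flag spam iff the word 'free' occurs."""
    return "free" in text

def word_score(text, threshold=2.0):
    """Less transparent but more expressive: sum per-word weights
    and compare against a threshold."""
    return sum(SPAM_WORDS.get(w, 0.0) for w in text) >= threshold

mail = ["winner", "claim", "your", "prize"]
print(rule_based(mail), word_score(mail))  # False True: the richer model catches it
```

The single rule is trivially explainable to a court but misses this mail; the weighted model catches it at the cost of requiring an explanation of several interacting weights, which is the trade-off the paper examines at scale.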